Evidence for the Pareto principle in Open Source Software Activity
نویسندگان
چکیده
Numerous empirical studies analyse evolving open source software (OSS) projects, and try to estimate the activity and effort in these projects. Most of these studies, however, only focus on a limited set of artefacts, being source code and defect data. In our research, we extend the analysis by also taking into account mailing list information. The main goal of this article is to find evidence for the Pareto principle in this context, by studying how the activity of developers and users involved in OSS projects is distributed: it appears that most of the activity is carried out by a small group of people. Following the GQM paradigm, we provide evidence for this principle. We selected a range of metrics used in economy to measure inequality in distribution of wealth, and adapted these metrics to assess how OSS project activity is distributed. Regardless of whether we analyse version repositories, bug trackers, or mailing lists, and for all three projects we studied, it turns out that the distribution of activity is highly imbalanced.
منابع مشابه
The Vital Few and Trivial Many: An Empirical Analysis of the Pareto Distribution of Defects
The Pareto Principle is a universal principle of the “vital few and trivial many”. According to this principle, the 80/20 rule has been formulated with the following meaning: For many phenomena, 80% of the consequences originate from 20% of the causes. In this paper, we applied the Pareto Principle to software testing and analysed 9 open source projects (OSPs) across several releases. The resul...
متن کاملOn the probability distribution of faults in complex software systems
Context. There are several empirical principles related to the distribution of faults in a software system (e.g. the Pareto principle) widely applied in practice and thoroughly studied in the software engineering research providing evidence in their favor. However, the knowledge of the underlying probability distribution of faults, that would enable a systematic approach and refinement of these...
متن کاملAn empirical study of package coupling in Java open-source
Excessive coupling between object-oriented classes in systems is generally acknowledged as harmful and is recognised as a maintenance problem that can result in a higher propensity for faults in systems and a „stored up‟ future problem. Characterisation and understanding coupling at different levels of abstraction is therefore important for both the project manager and developer both of whom ha...
متن کاملMinimizing the total tardiness and makespan in an open shop scheduling problem with sequence-dependent setup times
We consider an open shop scheduling problem with setup and processing times separately such that not only the setup times are dependent on the machines, but also they are dependent on the sequence of jobs that should be processed on a machine. A novel bi-objective mathematical programming is designed in order to minimize the total tardiness and the makespan. Among several mult...
متن کاملSalt: Combining ACID and BASE in a Distributed Database
This paper presents Salt, a distributed database that allows developers to improve the performance and scalability of their ACID applications through the incremental adoption of the BASE approach. Salt’s motivation is rooted in the Pareto principle: for many applications, the transactions that actually test the performance limits of ACID are few. To leverage this insight, Salt introduces BASE t...
متن کامل